Validation of language resources in TC-STAR
Abstract
In TC-STAR a variety of Language Resources (LR) are being produced. In this contribution we address the validation of resources that were created and used for the second Evaluation Campaign of the project. For the three types of topics covered by the project (ASR, SLT, TTS) the validation of both development and evaluation sets is described. For each type we successively address the description of the data, the validation procedures and the validation results. It is concluded that validation constitutes an important and useful element in the production of high quality TC-STAR language resources.
Similar resources
TC-STAR: New language resources for ASR and SLT purposes
In TC-STAR a variety of Language Resources (LR) is being produced. In this contribution we address the resources that have been created for Automatic Speech Recognition and Spoken Language Translation. As yet, these are 14 LR in total: two training SLR for ASR (English and Spanish), three development LR and three evaluation LR for ASR (English, Spanish, Mandarin), and three development LR and ...
Development, Factor Analysis, and Validation of an EFL Teacher Change Scale (TCS)
The concept of teacher change is critical in second language teaching and in the English as a Foreign Language (EFL) context, largely because almost everything done in teacher education aims to initiate change of one sort or another. A substantial body of research has been dedicated to investigating teacher change (TC) from various perspectives. However, having studied the related lite...
TC-STAR: Specifications of Language Resources and Evaluation for Speech Synthesis
In the framework of the EU-funded project TC-STAR (Technology and Corpora for Speech to Speech Translation), research on TTS aims at providing a synthesized voice that sounds like the source speaker speaking the target language. To progress in this direction, research within the TC-STAR framework is focused on naturalness, intelligibility, expressivity and voice conversion. For this purpose, spe...
Creating Slovenian Language Resources for Development of Speech-to-speech Translation Components
This article gives detailed information about the procedures for building Slovenian lexica within the LC-STAR project, and about the size of those lexica. The University of Maribor joined the LC-STAR project in order to provide appropriate language resources for developing speech-to-speech translation technology for the Slovenian language. The lexica consist of three parts: 65,000 common w...
Exploring XML-based technologies and procedures for quality evaluation from a real-life case perspective
The use of Extensible Markup Language (XML) for the annotation of Spoken Language Resources (SLR) is becoming increasingly common. As a result, the Speech Processing EXpertise centre (SPEX), which is the SLR validation centre of the European Language Resources Association (ELRA), is increasingly confronted with XML. The project “Lexica and Corpora for Speech-to-Speech Translation Com...